ENRICH: Building a European Digital Library of Manuscripts

Author

  • Adolf KNOLL
Abstract

The Manuscriptorium Digital Library provides access to catalogue data, fully digitized documents, and selected structured full texts from more than thirty memory institutions in the Czech Republic and abroad. It is the largest digital library of manuscripts operated by any national library in Europe. Thanks to the new EU project ENRICH, Manuscriptorium, a product of 15 years of cooperation between the National Library of the Czech Republic and AiP Beroun Co. Ltd., will try to aggregate similar information from many European countries and so provide seamless access to historical documents that are now dispersed. The work on which Manuscriptorium is based was the main reason why the National Library of the Czech Republic became, in 2005, the first institution ever to be awarded the UNESCO Memory of the World prize. The paper gives an overview of the plans and goals for the coming two years.

In 1992, a few people from the National Library and the company now called AiP Beroun Ltd. decided to try to comply with a request from UNESCO to produce a CD on old books and manuscripts. None of us could have known that, some years later, this would lead us to consider aggregating similar content from other European institutions and projects under a digital library interface that was given the name Manuscriptorium. This is our story in brief.

Manuscriptorium is the largest manuscript digital library operated by any European national library. It contains data from more than thirty institutions in the Czech Republic and abroad: catalogue records, images, and, more recently, structured historical full texts. About 800,000 digitized pages are available; they represent mostly manuscripts (about 65%), but also old and rare printed books and historical maps.
The collections of the National Library account for about 55% of the content. From foreign institutions we have mostly catalogue data (but also images), from countries such as Slovakia, Poland, Turkey, Lithuania, and Hungary, and in very small proportions from some others.

Solid standardization underlies the Manuscriptorium Digital Library and provides the platform for any further development. The digital documents comply with a complex public XML schema (http://digit.nkp.cz/MMSB/1.1/msnkaip.xsd) whose descriptive part is based on MASTER, a special format for the electronic description of manuscripts developed in an EU project of the same name at the beginning of the new millennium. The structural metadata reflect our long experience of working with digitized manuscripts in the 1990s, while the technical metadata are based mostly on the NISO Z39.87-2002 standard, Data Dictionary: Technical Metadata for Digital Still Images [2], and on the DIG35 Metadata Specification issued by the International Imaging Industry Association [3].

Each digitized page is usually represented by up to five images that differ in resolution, compression quality factor, and format. Web formats are recommended: JPEG (used mostly for low and normal user-quality images) and GIF or PNG (for black-and-white images, thumbnail gallery images, and preview images; JPEG is sometimes used for previews, too). Historical maps are represented in the MrSID format (*.sid) in combination with the LizardTech Image Express server, which makes it comfortable to work with large map files. The full text is structured according to our own DTD, which is based on the TEI standard [4].

Digitization for Manuscriptorium in the Czech Republic is supported by a programme of the Ministry of Culture, which issues annual calls for proposals. Libraries of various institutions, including monasteries, castles, museums, or archives, use these calls to obtain up to 70% of the funds needed to digitize specific titles from their collections.

As manuscripts and old printed books of the same provenance are dispersed among many collections all over the world, we started trying to aggregate similar content from outside our country as well. We had to solve various standardization issues to be able to accept data from different sources. For catalogue records, this meant mostly preparing tuned conversion rules (transformations) for cooperating institutions, especially from various MARC applications, into our internal MASTER format. Descriptive metadata alone are not enough, however: we would also like to gather the visual representations of digitized documents under the same interface, to give our users seamless access to dispersed resources.

Manuscripts in other institutions are represented digitally in various ways. Most often the institution has only catalogue data, mostly in one of the various flavours of MARC, sometimes in MASTER, and sometimes in very specific formats of its own. We know from practice that the same format does not mean the same approach, so even the general MARC21/UNIMARC > MASTER transformations must be checked against concrete results and adapted. We call such adapted transformations connectors, because they connect the partner's metadata with our database. We think that the variety of formats is less of a problem than the imperfect application of descriptive rules can be. In many cases it makes sense to write a specific conversion utility, provided the collection is important and the proprietary rules are applied consistently.

In general, institutions are willing to share their catalogue data, but problems arise when we start to talk about sharing the visual representations of originals.
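The connector idea described above (a general MARC > MASTER transformation that is checked and then adapted per partner) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the dictionary-based record layout, the `make_connector` helper, and the MASTER-like element names are hypothetical and are not taken from the Manuscriptorium schema or from any actual connector.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified MARC record: {tag: {subfield_code: value}}.
# Real MARC records allow repeated fields and subfields; that is ignored here.

# General rules: (MARC tag, subfield) -> path of MASTER-like elements.
# The element names are illustrative only, loosely modelled on MASTER.
GENERAL_RULES = {
    ("852", "a"): "msIdentifier/repository",
    ("852", "j"): "msIdentifier/idno",
    ("100", "a"): "msHeading/author",
    ("245", "a"): "msHeading/title",
    ("260", "c"): "msHeading/origDate",
}


def make_connector(overrides=None):
    """Build a converter from the general rules plus partner-specific overrides.

    An override value of None switches a general rule off for that partner.
    """
    rules = {**GENERAL_RULES, **(overrides or {})}

    def convert(record):
        root = ET.Element("msDescription")
        for (tag, code), path in rules.items():
            if path is None:
                continue  # rule disabled for this partner
            value = record.get(tag, {}).get(code)
            if not value:
                continue  # nothing to map
            node = root
            # Walk the path, creating each missing element once.
            for name in path.split("/"):
                child = node.find(name)
                if child is None:
                    child = ET.SubElement(node, name)
                node = child
            node.text = value.strip()
        return root

    return convert


# Hypothetical partner that records shelfmarks in 852$h instead of 852$j.
partner_connector = make_connector({
    ("852", "j"): None,
    ("852", "h"): "msIdentifier/idno",
})
```

Calling `partner_connector({"852": {"a": "Example Library", "h": "MS A 1"}})` returns an `msDescription` element whose `msIdentifier` holds both the repository and the shelfmark. Keeping the general rules separate from the per-partner overrides mirrors the point made in the text: the same format does not guarantee the same cataloguing practice, so only the differences need to be written for each institution.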
The reasons for this reluctance may be cultural and political, or technical; sometimes they seem to be a mixture of both. For cultural reasons, we base the aggregation of various types of content on shared Internet storage of image or other data. In this way, we just need to upload and index the correct metadata file describing the whole manuscript with working structural links leading to

2 Data Dictionary: Technical Metadata for Digital Still Images. Released as a Draft Standard for Trial Use, June 1, 2002 to December 31, 2003. Recently published as a working draft, version 1.2 (Working Draft 1.2, March 27, 2007, http://www.loc.gov/standards/mix/docs/NISODataDictionary_March2002.doc)
3 http://www.i3a.org
4 http://digit.nkp.cz/MSSFullText/DTD/1.00/mss-fulltext.dtd
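The abstract describes each digitized page as having up to five images in different formats, reached from the metadata file through structural links to shared storage. A minimal Python sketch of that idea follows. The class names, the `role` labels, and the `check_page` helper are hypothetical, and the check is purely syntactic: it does not fetch anything, so it cannot prove that a link actually works.

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse

MAX_VARIANTS_PER_PAGE = 5             # "up to five images for one digitized page"
WEB_FORMATS = {"jpeg", "gif", "png"}  # web formats the text recommends


@dataclass
class ImageVariant:
    role: str  # hypothetical label, e.g. "thumbnail", "preview", "normal"
    fmt: str   # image format name
    url: str   # structural link to the image on shared storage


@dataclass
class Page:
    label: str
    variants: list = field(default_factory=list)


def check_page(page):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if len(page.variants) > MAX_VARIANTS_PER_PAGE:
        problems.append(f"{page.label}: more than {MAX_VARIANTS_PER_PAGE} images")
    for variant in page.variants:
        if variant.fmt.lower() not in WEB_FORMATS:
            problems.append(f"{page.label}/{variant.role}: format {variant.fmt!r}")
        parts = urlparse(variant.url)
        # Syntactic check only: an absolute http(s) URL with a host name.
        if parts.scheme not in ("http", "https") or not parts.netloc:
            problems.append(f"{page.label}/{variant.role}: link {variant.url!r}")
    return problems
```

A page whose variants all use recommended formats and absolute links yields an empty list, while a variant such as `ImageVariant("normal", "tiff", "1v.tif")` is reported twice, once for the format and once for the relative link. Map images in MrSID would need their own rule, which this sketch leaves out.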


Similar resources

Homogenizing Access to Heterogeneous Resources of Digitized Manuscripts

The Manuscriptorium Digital Library is a seamless aggregator of data from ca. 100 memory institutions in Europe. It offers tools for data contributors as well as for user-oriented personalization. The metadata from Manuscriptorium are used by many other services, such as The European Library, EUROPEANA, and the SUMMON and EBSCO resource discovery services.


Europeana: Towards The European Digital Library

This paper briefly describes the process that will lead to Europeana, the European Digital Library. This process is currently running, so that it is possible to give only an account of its inception, involved actors and projects, and current status. The paper concludes by quickly outlining the role that CNR has in the making of Europeana. 1 Europeana: the inception In October 2004 Google launch...


Towards a Premodern Manuscript Application Profile

Individuals who wish to develop digital scholarly works and libraries that wish to provide access to their precious and fragile holdings have an interest in digitizing premodern manuscripts. These handmade objects are often beautiful and each one is unique. The features of zooming and light alteration available through digital photography and manipulation are assets to medieval scholars, becaus...


Agile DL: Building a DELOS-Conformed Digital Library Using Agile Software Development

This paper describes a concrete partial implementation of the DELOS Reference Model to the particular field of manuscripts and incunabula, and how an agile software methodology, SCRUM, suits the evolutive nature of Digital Libraries, solving misunderstandings and lightening the underlying model.


Development of Quality Performance of National Digital Library with Kano's Model Approach

Background and Aim: The purpose of this study is to determine the quality requirements of the National Digital Library based on the Kano model and to categorize users' needs into three groups: basic, functional, and motivational. Methods: This survey was conducted with a qualitative approach. The requirements of the digital library were extracted using two standards: "Digiqual manual" and the "D...




Publication date: 2007